System and method for identifying an object

ABSTRACT

A system and method for determining the classification of a signal, or the identification of an object, is provided. Based on rough set theory, or data mining, a training data set (105) is partitioned (115) and labeled (125) with a multi-class entropy method. Reducts (145) are calculated from a sub-set of the best-performing columns (130) of the partitioned and labeled training set data. These reducts are applied to test signals and combined for each signal classification. The present system and method produces a more accurate, robust and efficient classification result.

RELATED INVENTIONS

[0001] This application claims priority to U.S. provisional application serial No. 60/220,768, filed Jul. 21, 2000, which is hereby incorporated by reference. This application also incorporates by reference the Ph.D. dissertation “High Range Resolution Radar Target Classification: A Rough Set Approach” of Dale E. Nelson and Prof. Janusz Starzyk.

FIELD OF THE INVENTION

[0002] The invention relates generally to a new organization of knowledge discovery in the information system described by a rough set theory. It provides a method and system for interpreting data and transforming it into useful information. The invention finds particular application to an information system and a method for solving or identifying pattern recognition problems, such as speech recognition, character recognition and fingerprint recognition. The system and method are also applicable to data mining and knowledge discovery in any database where knowledge is described by a discrete set of features, such as radar signature data, data from non-destructive inspection techniques used in industrial quality control operations, medical test data, the stock market as represented by indices such as the Dow Jones Index and other financial parameters, among numerous other possible uses.

BACKGROUND OF THE INVENTION

[0003] Different techniques have been used in the past to extract useful information from a data set. In a data set or information system that can be represented as a table, with rows representing objects or signals associated with a specific class, and columns representing attributes of the objects, a number of methods have been used to identify or classify those objects.

[0004] For example, High Range Resolution (HRR) radar imaging data, which can be used for an Automatic Target Recognition (ATR) system for military aircraft, can be represented as a table of data having rows representing signals with columns representing range bins (this example will be further discussed and described below). In the past, one of the most frequently chosen techniques to classify these HRR signatures (or identify the target aircraft represented by these HRR signals) has been to use a constrained quadratic classifier. This classifier is based on computing the mean and variance estimation for each range bin, or column entry, in the signal. A variant of this technique is to use the mean square error instead of the variance term.

[0005] This approach works best when there is a small class of targets to be identified, or classified, such as five or ten targets. In addition, this approach does very poorly at rejecting, or not declaring on, unknown targets. Further, it is not robust because it tries to match range bins (column entries) in the signal which contain little or no information about the target. Typically, these range bins are at the beginning or at the end of the signal.

[0006] It has become apparent that there was room for significant improvement in the area of statistical pattern recognition. Applying emerging machine intelligence and data mining techniques to overcome the errors with estimations and assumptions in current statistical classifiers is highly desirable.

[0007] Rough set theory is an approach to data mining that has been around since the early 1980s. It was believed this theory had the potential to produce a more robust classifier. Rough Set Theory assumes that the training data set is all that is known and all that needs to be known to do the classification problem. Techniques to find the minimal set of attributes (columns, or range bins for the HRR problem example) to do the classification are available in the theory. Further, the theory should be robust since it will find all the classifiers.

[0008] A workable, robust classifier using machine learning and data mining techniques is needed. Specifically, the approach should determine which features, or attributes (columns), are important; generate a multiplicity of classifiers; be robust; and be computationally appropriate for real world problem solving.

[0009] Once the data is labeled, Rough Set Theory guarantees that all possible classifiers using that training data set will be found. No equivalent statement can be made using statistical pattern recognition techniques. However, in the known Rough Set Theory method, generating all the classifiers is an NP-hard problem (no polynomial-time algorithm is known for it). In summary, all known methods are either subject to error, are computationally inefficient and therefore inappropriate for large problem sets, or both.

[0010] The present invention overcomes the above-described problems and others. It provides a computationally efficient, robust classification system and method that can be used on a wide variety of pattern recognition problems and for other types of data mining tasks.

SUMMARY OF THE INVENTION

[0011] According to one embodiment of the present invention, a system for identifying an object represented by a signal is provided. The system includes a computer system, training and testing information data sets, a labeler, a reduct classifier and a reduct classification fuser. The data sets are in tabular form with multiple rows and columns of data. Each row represents a signal: one or more known signals, or objects, in the training data set, and one or more signals, or objects, to be identified or classified in the testing data set. The system provides more accurate solutions for the object classification problem by use of the reduct classification fuser, as will become apparent from a reading of the detailed description and figures that follow.

[0012] According to another aspect of the invention, a partitioner and a column selector are provided which enable larger problems (which would be computationally intractable with current methods) to be solved quickly, even with standard, commercially-available personal computers.

[0013] In accordance with a more limited aspect of the present invention, the system includes normalization logic to reduce the effects of scale between and within a signal. Furthermore, depending on the data being identified or classified, a wavelet transformer may be provided to wavelet transform each signal in a partition using a multi-level wavelet transform. The wavelet coefficients become additional signal attributes that are appended to the end of the signals, thereby creating more columns of data for each row.

[0014] According to another embodiment of the present invention, a system for classifying a signal is provided. The system includes a computer system, a training and a testing data set and a reduct classification fuser. The reduct classification fuser combines otherwise marginal reducts to gain a better result for the signals classified. The data sets are presented in tabular form with each data set having a plurality of rows and columns of data. Each of the rows represents a signal.

[0015] As in the first embodiment, the system can also include a core calculator, a labeler, normalization logic, a partitioner and a wavelet transformer. The labeler can include fuzz factor logic (used on the test or actual data to be classified) that will be further described below.

[0016] A method of determining what classification a signal belongs to is also provided by the present invention. The method includes the steps of: providing a training information set; binary labeling the signals of the training information set; selecting a subset of the columns of the training information set that were binary labeled; calculating the reducts; determining the classification of the test signals using the reducts calculated for the training information set; and determining a final classification of each of the testing signals by combining, or fusing, the separate reduct classifications for each of the test signals. As will be explained in more detail below, this method of signal classification provides a better, more efficient and robust result than known Rough Set Theory methods or statistical methods.

[0017] In accordance with another aspect of the invention, a method step for partitioning the training information set into a plurality of partitions of columns is provided. This allows for more computationally efficient problem solving.

[0018] In accordance with a more limited aspect of the present invention, the method can include the step of binary labeling each of the test signals, which may include the substep of using fuzz factor logic to screen columns from the classification where the labeling may be in doubt due to round-off error, noise, or other effects.

[0019] One advantage of the present invention is that the systems and method can automatically solve pattern recognition problems with high dimensional data sets by reducing computational cost.

[0020] Another advantage of the present invention is that the systems and method provide better accuracy of the classification solutions.

[0021] Yet another advantage of the present invention is that the systems and method are directly transferable to real time computing hardware structures, although a particular hardware organization is not a subject of this invention.

[0022] Still further advantages of the present invention will become apparent to those of ordinary skill in the art upon reading and understanding the following detailed description of the preferred embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] In the accompanying drawings which are incorporated in and constitute a part of the specification, embodiments of the invention are illustrated, which, together with a general description of the invention given above, and the detailed description given below, serve to example the principles of this invention.

[0024] FIG. 1 is an exemplary overall system diagram of a system for identifying an object, or classifying a signal, in accordance with the present invention;

[0025] FIG. 2 is an exemplary diagram illustrating the block partitioning method of four exemplary partitioning schemes for partitioning columns of attributes of the training information data set in accordance with the present invention;

[0026] FIG. 3 is an exemplary diagram illustrating the interleave partitioning method of four exemplary partitioning schemes of partitioning columns of attributes of the training information data set in accordance with the present invention;

[0027] FIG. 4 is an exemplary overall process flow diagram of a method for determining the classification of a signal, in accordance with the present invention;

[0028] FIG. 5 illustrates an example of sample HRR data in tabular form;

[0029] FIG. 6 illustrates the sample data of FIG. 5 after it has been labeled;

[0030] FIG. 7 illustrates ambiguous signals of the labeled data of FIG. 6;

[0031] FIG. 8 illustrates the equivalence classes of labeled signals of FIG. 6 after removing ambiguous signals;

[0032] FIG. 9 is illustrative of how the core is computed for the labeled data of FIG. 6;

[0033] FIG. 10 is also illustrative of how the core is computed for the labeled data of FIG. 6.

DETAILED DESCRIPTION OF ILLUSTRATED EMBODIMENT

[0034] The following includes definitions of exemplary terms used throughout the disclosure. Both singular and plural forms of all terms fall within each meaning:

[0035] “Software”, as used herein, includes but is not limited to one or more computer executable instructions, routines, algorithms, functions, modules or programs including separate applications or from dynamically linked libraries for performing functions as described herein. Software may also be implemented in various forms such as a servlet, applet, stand-alone, plug-in or other type of application.

[0036] “Logic”, as used herein, includes but is not limited to hardware, software and/or combinations of both to perform one or more functions.

[0037] “Network”, as used herein, includes but is not limited to the internet, intranets, Wide Area Networks (WANs), Local Area Networks (LANs), and transducer links such as those using Modulator-Demodulators (modems). “Internet”, as used herein, includes a wide area data communications network, typically accessible by any user having appropriate software. “Intranet”, as used herein, includes a data communications network similar to an Internet but typically having access restricted to a specific group of individuals, organizations, or computers.

[0038] “Object”, as used herein, includes any item to be determined in a pattern recognition task. It is synonymous with a “signal” to be classified, or identified. The object may be a target, from radar imaging data, or any other item that can be represented as a one-dimensional signal for a recognition task to be solved. Fingerprint recognition, voiceprints, sonar data, etc., are all possible uses for the inventive method and systems described and claimed herein. Any object that can be readily represented as a one-dimensional signal, or be mathematically approximated by one or more one-dimensional signals, is a candidate problem that can be solved with the systems and method described and claimed herein based on rough set theory.

[0039] The invention presents a new organization of knowledge discovery in the information system described by a rough set theory and therefore is applicable to a number of pattern recognition problems like speech recognition, character recognition, and fingerprint recognition. It is also applicable to data mining and knowledge discovery in databases where knowledge is described by a discrete set of features.

[0040] The determination of classifiers using rough set theory is a difficult problem whose computational expense increases exponentially with the number of attributes. The invention describes a partitioning method which improves computational efficiency for rough set analysis. As a result of partitioning, a set of classifiers is created. A method of fusing the individual classifiers is developed. The fused classifier outperforms any of the individual classifiers.

[0041] Thus, the invention has two distinctive features: (1) it provides means to solve pattern recognition problems with high dimensional data sets by reducing computational cost; and (2) it provides better accuracy of the solution. The method is directly transferable to a real time computing hardware structure, although a particular hardware organization is not a subject of this invention.

[0042] The inventive method and systems were driven by a need to create an efficient hardware realization for a target recognition system, as will be further described below, but again we stress that this is but one example of the use of the inventive systems and method described herein. In summary, existing statistical methods suffer from many problems when used for high dimensional data sets. This invention uses a local group of features for the object recognition task that otherwise may be easily overlooked when statistical characterization of the recognition task takes place in the entire set of features. The invention described further below uses the idea of marginal classifiers and fusing them to achieve a more accurate result. In an example of the inventive systems and method that will be further described herein, the MATLAB® program was used for rough set generation and a High Range Resolution radar data set for a target recognition problem was solved.

[0043] This invention performs a function known as data mining. Data mining uses a set of information (training data set, or training information set) that is assumed to be all that is known. The information is organized as a table or array. Each column of the table is an attribute (something known about the item to be classified) and the rows represent all the attributes that are associated with a given object or class and are called signals. Each signal in the training set must have a correct classification associated with it. This invention will determine the sub-set of attributes that are capable of determining which class a signal belongs to. In other words, the information has been reduced without giving up the classification ability. This process is automatic and requires no human intervention or intuition. This invention is applicable to any sort of classification problem (especially to any 1-D type of signal such as high range resolution radar) where the problem can be represented as a table of attributes.

[0044] The invention can be implemented on a general-purpose computer system, such as 100 in FIG. 1, and could be written in the language of the MATLAB® programming tool, as an example. This system has limitations imposed by constraints of time, memory and problem size. Problems consisting of multiple partitions using approximately 50 attributes and 5000 exemplars each are solved in a reasonable length of time. This requires a processor (Pentium II class) speed of at least 366 MHz and memory of at least 256 MB. Of course, it will be appreciated that faster computer systems 100 can solve larger problems and this size problem is just exemplary and is not in any way limiting on the systems and method described and claimed herein.

[0045] In the past, in order to determine the reduced sets of attributes in accordance with rough set theory, all combinations of attributes had to be tried. This process grew exponentially with the number of attributes n according to the following formula:

$\sum_{k=1}^{n} \frac{n!}{k!\,(n-k)!}$

[0046] so, for example, with n=29, the total number of possible reducts to be calculated is 536,870,911 (that is, 2^29 − 1). For our HRR radar data for aircraft ATR example, we are interested in n=50 or more. For many real world size problems, the time required for these full calculations is prohibitive. Known rough set theory is not very applicable to high dimension data set problems. This invention, through labeling, partitioning, use of an innovative method of determining the reduced set, and fusing the partition results, has changed the time complexity to be quadratic instead of exponential. This permits real world problems to be solved.
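The growth described above can be checked numerically. The following is a minimal sketch, not part of the described system; the function name `subset_count` is an assumption introduced only for illustration. It evaluates the sum of binomial coefficients directly and compares it with the closed form 2^n − 1.

```python
from math import comb


def subset_count(n: int) -> int:
    """Number of non-empty attribute subsets: sum over k of n! / (k! (n-k)!)."""
    return sum(comb(n, k) for k in range(1, n + 1))


# n = 29 reproduces the figure quoted in the text.
print(subset_count(29))               # 536870911
print(subset_count(29) == 2**29 - 1)  # True
```

For n=50 the same sum equals 2^50 − 1, which illustrates why exhaustive enumeration becomes impractical at that size.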

[0047] Illustrated in FIG. 1 is an exemplary overall system diagram in accordance with the present invention that determines the identification of an object or the classification of a signal. A computer system 100 includes one or more computers that run software to process information. The computer or computers can be stand-alone, or the user interface 10 may be connected via a network 20. For example, there may be a user interface application (not shown) that is programmed to take the user through the object identification tasking and therefore could function as a pre-processor, a post-processor, or both, as is known in the computing arts.

[0048] A training information data set 105 and a testing information data set 105 are provided. The training and testing information data sets 105 are presented to the computer system 100 as an array, or table. The values in a row are associated with each other and are called a signal. The values in each column of a signal can come from any source as long as the values are related to the classification of that signal. These column values are attributes of the signals. FIG. 5 represents a small sample of ten signals representing three different targets for a High Range Resolution (HRR) radar classification problem, as an example of the types of problems that can be solved with the inventive method and systems described herein. Only four attributes, or columns of the data, are shown. These are referred to as range bins for the HRR data.

[0049] For training and testing purposes, the correct classification of the training information data set's 105 signals must be provided. When the system is in use, a test signal is provided from the testing information data set 105 and the computer system 100 provides the classification if possible.

[0050] The computer system 100 can use normalization logic 110 to normalize each signal in the training information data set 105, thereby reducing the effects of scale between and within a signal. The normalization process used is of course dependent on the particular application, as will be appreciated by one of skill in the art. For HRR radar signature data, where the signal values can be integers in the range of 0 to 255, a 2-Norm can be used. The 2-Norm is defined as:

$N = \left( \sum_{i} y_{i}^{2} \right)^{\frac{1}{2}}$

[0051] where y_(i) are the attribute (column) values of a signal.
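A minimal sketch of 2-Norm normalization follows. The text defines N but does not spell out how it is applied; dividing every attribute value by N is an assumption made here, as are the function names `two_norm` and `normalize`.

```python
import math


def two_norm(signal):
    """N = (sum of y_i squared) ** 0.5 over the attribute values of one signal."""
    return math.sqrt(sum(y * y for y in signal))


def normalize(signal):
    """Scale one signal to unit 2-Norm (assumed use of N; not stated in the text)."""
    n = two_norm(signal)
    if n == 0:
        return list(signal)  # an all-zero signal is left unchanged
    return [y / n for y in signal]


print(normalize([3, 4]))  # [0.6, 0.8]
```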

[0052] In order to handle large data sets and to obtain the best results, the training information data set or training data 105 can be partitioned. A partitioner 115 is provided for this function and there are at least two types of partitions that can be used, block and interleave.

[0053] Referring now to FIG. 2, an example of block partitioning for an exemplary signal with 128 attributes (columns) would use the first 64 columns and the last 64 columns for two partitions. For four partitions there would be four partitions of 32 columns each: columns 1 to 32, columns 33 to 64, columns 65 to 96 and columns 97 to 128. For a larger number of partitions a similar procedure would be followed. The partitioner 115 performs this function for the computer system 100.
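Block partitioning as described above can be sketched in a few lines. This is an illustrative sketch only; the name `block_partitions` is hypothetical, column numbers are 1-based to match the text, and the column count is assumed to be evenly divisible by the number of partitions.

```python
def block_partitions(n_columns, n_partitions):
    """Split columns 1..n_columns into contiguous blocks of equal size."""
    size = n_columns // n_partitions  # assumes an exact division
    return [list(range(p * size + 1, (p + 1) * size + 1))
            for p in range(n_partitions)]


parts = block_partitions(128, 4)
print([(p[0], p[-1]) for p in parts])  # [(1, 32), (33, 64), (65, 96), (97, 128)]
```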

[0054] Referring now to FIG. 3, an example of interleaved partitioning is illustrated that is a bit more complex. For two partitions, one partition would consist of the even numbered columns and the other of the odd numbered columns. If there are four partitions, the first partition would consist of columns numbered 1, 5, 9, 13, etc. The second partition would consist of columns numbered 2, 6, 10, 14, etc. The third partition would consist of columns numbered 3, 7, 11, 15, etc. The fourth partition would consist of columns numbered 4, 8, 12, 16, etc., up to column 128. For a larger number of partitions, a similar procedure would be followed (FIG. 3).
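Interleaved partitioning amounts to taking every p-th column starting from a different offset. The sketch below is illustrative; the name `interleave_partitions` is hypothetical and column numbers are 1-based to match the text.

```python
def interleave_partitions(n_columns, n_partitions):
    """Partition p (0-based) takes columns p+1, p+1+n_partitions, ... up to n_columns."""
    return [list(range(p + 1, n_columns + 1, n_partitions))
            for p in range(n_partitions)]


parts = interleave_partitions(128, 4)
print(parts[0][:4])  # [1, 5, 9, 13]
print(parts[3][:4])  # [4, 8, 12, 16]
```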

[0055] A wavelet transformer 120 can be provided for use by the computer system 100. For each partition (see FIGS. 2 and 3), each signal in the partition can be wavelet transformed using a multi-level wavelet transform. Whether or not this step is performed is dependent on the data being classified. For signals where the various values are separated in time (frequency), wavelet transformation is valuable. The wavelet chosen has been determined to not be important, so a Haar wavelet (the simplest) is used. The wavelet coefficients become additional signal values and are appended to the end of the signal, creating more columns in each row. Wavelet transformation produces an approximation of the original signal (which contains half as many values as the signal itself) and a detail of the signal (which contains half as many values as the signal itself). The approximation is just that, an approximation of the original signal. The detail of the signal has the nuances of the signal. The approximation of the signal is coarse; the detail of the signal has the finer features. Multi-level wavelet transformation means that a wavelet transformation is performed on the values of the approximation of the signal and then on the values of the detail of the original signal. This process continues until there are insufficient values to do a wavelet transformation.

[0056] As an example of partitioned data being wavelet transformed via a computer system 100 using a wavelet transformer 120 with the appropriate logic for a Haar wavelet transform, HRR signals having 128 range bins, or columns, become signals with 1,024 pseudo range bins. A signal with 64 range bins (2 partitions) becomes a signal with 448 pseudo range bins. The signals with 32 range bins (4 partitions) become signals with 192 pseudo range bins. And finally, signals divided into eight partitions and thus having 16 range bins, become signals with 80 pseudo range bins.
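The column counts quoted above (128 to 1,024, 64 to 448, 32 to 192, 16 to 80) are consistent with transforming both the approximation and the detail at every level and appending all coefficients to the original signal. The sketch below follows that reading, which is an interpretation of the text rather than a confirmed implementation. The function names and the 1/sqrt(2) Haar scaling are assumptions, and the signal length is assumed to be a power of two.

```python
import math


def haar_step(values):
    """One Haar level: pairwise sums (approximation) and differences (detail)."""
    s = math.sqrt(2)  # orthonormal scaling; an assumed convention
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def multilevel_haar(signal):
    """Original values followed by the coefficients of every band at every level."""
    out = list(signal)
    bands = [list(signal)]
    while len(bands[0]) >= 2:  # stop when a band is too short to transform
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
            out.extend(approx)
            out.extend(detail)
        bands = next_bands
    return out


for n in (128, 64, 32, 16):
    print(n, len(multilevel_haar([0.0] * n)))  # 1024, 448, 192, 80
```

Under this reading a signal of length n grows to n(log2(n) + 1) values, which matches all four figures in the text.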

[0057] For a rough set analysis, the signals, now supplemented by the wavelet coefficients, must be labeled. An analogy to fuzzy sets using a picture would be that fuzzy sets are concerned with how gray the pixels are whereas rough sets are concerned with how large the pixels are. Computer system 100 includes a labeler 125 for this purpose. Labeler 125 has logic that uses a multi-class entropy method. Each column of the training set of signals is searched sequentially to establish the optimum point (threshold) which best separates signals of the various training classes. The quality of partition is measured by the entropy based information index defined as follows:

$I = 1 - \frac{\Delta E}{E_{\max}}$

where

$\Delta E = -\sum_{a=0}^{1} \sum_{c=1}^{n_{c}} p_{ac} \log\left(p_{ac}\right) + \sum_{a=0}^{1} p_{a} \log\left(p_{a}\right)$

and

$E_{\max} = -\sum_{c=1}^{n_{c}} p_{c} \log\left(p_{c}\right)$

[0058] a is the logic function of the signal column and is equal to 1 if the value exceeds the threshold and 0 otherwise, n_(c) is the number of classes in the training set, and p_(c), p_(a), p_(ac) are the class probabilities, attribute probabilities and joint probabilities respectively.
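The information index can be evaluated for one column and one candidate threshold as sketched below. This is an illustration under stated assumptions: probabilities are estimated as relative frequencies, terms with zero probability contribute nothing, the training set contains at least two classes, and candidate thresholds are taken as midpoints between sorted distinct values. The function names are hypothetical.

```python
import math


def _plogp(p):
    """p * log(p), with the usual convention that 0 * log(0) = 0."""
    return p * math.log(p) if p > 0 else 0.0


def information_index(values, classes, threshold):
    """I = 1 - dE / E_max for one column split at the given threshold."""
    n = len(values)
    labels = [1 if v > threshold else 0 for v in values]  # the logic function a
    class_ids = sorted(set(classes))
    e_max = -sum(_plogp(classes.count(c) / n) for c in class_ids)
    delta_e = 0.0
    for a in (0, 1):
        delta_e += _plogp(labels.count(a) / n)            # + sum p_a log p_a
        for c in class_ids:
            joint = sum(1 for l, k in zip(labels, classes) if l == a and k == c)
            delta_e -= _plogp(joint / n)                  # - sum p_ac log p_ac
    return 1 - delta_e / e_max


def best_threshold(values, classes):
    """Search midpoints between sorted distinct values (an assumed candidate set)."""
    distinct = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: information_index(values, classes, t))
```

With hypothetical values `[0.1, 0.2, 0.8, 0.9]` and classes `[1, 1, 2, 2]`, a threshold near 0.5 separates the classes perfectly and gives an index of 1, while a column that carries no class information gives an index of 0.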

[0059] An example of labeling of the training signals is shown in FIGS. 5 and 6. FIG. 5 represents a sample of HRR Data for a training information data set (such as 105 in FIG. 1). Ten signals are shown and they correspond to three known targets. Labeling data with a labeler 125 makes data easier to handle and helps bring out patterns and trends. However, once data is labeled, some discrimination power is lost. In general, this is not critical and is even desirable. For example, cars are often labeled into categories such as subcompacts, compacts, mid-sized cars, etc. People are categorized in age ranges and given labels such as children, adolescents, young adults, middle-aged and seniors.

[0060] An example of the labeled HRR data of FIG. 5 is found in FIG. 6. Any value in the table of FIG. 5 less than 0.25 is labeled with a 1; any value between and including 0.25 and 0.45 is labeled with a 2; and any value greater than 0.45 is labeled with a 3.
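The three-level labeling rule used for the FIG. 5 example can be written directly. The sketch below is illustrative; the function name and the sample row are hypothetical, since the FIG. 5 values are not reproduced in this text.

```python
def label_value(value, low=0.25, high=0.45):
    """Label 1 below low, 2 from low to high inclusive, 3 above high."""
    if value < low:
        return 1
    if value <= high:
        return 2
    return 3


row = [0.10, 0.25, 0.45, 0.60]        # hypothetical normalized range-bin values
print([label_value(v) for v in row])  # [1, 2, 2, 3]
```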

[0061] A column selector 130 is used by the computer system 100 to select a subset of the best columns. For example, in our HRR example, we would typically desire the 50 best columns, meaning the columns that have the highest information index, to be selected. These are the columns that should do the best job of classifying the signals or identifying the targets. The number 50 was chosen as this is the largest practical size that can be computed on a 400 MHz processor with 256 MB of memory. Faster computers with larger memories would permit a much larger number of columns to be considered for real time analysis.

[0062] Next, computer system 100 includes duplicate and ambiguous signal removal logic 135 to remove duplicate and ambiguous signals. This applies only to training information data set 105. Duplicate signals are defined as signals that are identical and are from the same classification. Ambiguous signals are signals which are identical and from different classifications. FIG. 7 shows that rows 7 and 10 from the FIG. 6 example data are ambiguous, since both have the same attributes, or column entries, but are associated with different target classes, namely target 2 for signal 7 and target 3 for signal 10. After labeling and selecting a subset of the signal columns it is likely that there will be duplicate and ambiguous signals. Ambiguous signals are confusing as the same signal is from two different classes. Therefore, both signals are removed. FIG. 8 shows equivalence classes made up from the signals seen in the labeled data of our FIG. 6 example. Duplicate signals in the equivalence classes can be removed to save computation time.
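A minimal sketch of duplicate and ambiguous signal removal follows. It groups identical labeled rows, drops every group that spans more than one class, and keeps a single copy of the rest. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def remove_duplicates_and_ambiguous(signals, classes):
    """Return (rows, classes) with duplicates collapsed and ambiguous rows dropped."""
    seen = defaultdict(set)  # labeled row -> set of classes it appears with
    for row, cls in zip(signals, classes):
        seen[tuple(row)].add(cls)
    kept_rows, kept_classes = [], []
    for row, cls_set in seen.items():
        if len(cls_set) == 1:  # same row, one class: keep one copy
            kept_rows.append(list(row))
            kept_classes.append(next(iter(cls_set)))
        # same row with several classes is ambiguous: all copies are removed
    return kept_rows, kept_classes


signals = [[1, 2], [1, 2], [3, 1], [3, 1], [2, 2]]  # hypothetical labeled rows
classes = [1, 1, 2, 3, 1]
print(remove_duplicates_and_ambiguous(signals, classes))
# ([[1, 2], [2, 2]], [1, 1])
```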

[0063] Computer system 100 next uses a core calculator 140 for calculating the core, as will be further described below. The set of training signals is sometimes referred to as an information system composed of attributes (columns) and a decision attribute (a column which contains the correct classification of the signal or row). After labeling, if it is possible to properly classify all signals using a sub-set of the attributes, that sub-set of attributes is called a reduct (a reduction in the size of the information system without losing any information, that is, without losing the ability to classify all the signals). It should be noted that there may be many reducts for a given information system. The attributes that are common to all reducts are known as the core. The core is determined by removing one column at a time. After the column is removed, the training set is examined to determine if there are any ambiguous signals. If there are, that means that that column (or attribute) is the only column of values that can distinguish the ambiguous signals. It must therefore be part of the core. For example, FIG. 9 shows that by removing Column 1, or range bin 1, signals 6 and 8 are ambiguous. Therefore, Column 1 is part of the core. Whereas in FIG. 10, removal of range bin 2 does not result in any ambiguous signals. Therefore, range bin 2 is not part of the core. This process continues until each column has been examined. The set of all columns, which when removed resulted in ambiguous signals, comprises the core.
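The core computation described above, removing one column at a time and checking for ambiguity, can be sketched as follows. Function names and the sample table are hypothetical, and column indices are 0-based here, unlike the 1-based range bins in the figures.

```python
def has_ambiguity(signals, classes, columns):
    """True if two rows agree on the given columns but belong to different classes."""
    seen = {}
    for row, cls in zip(signals, classes):
        key = tuple(row[c] for c in columns)
        if seen.setdefault(key, cls) != cls:
            return True
    return False


def compute_core(signals, classes):
    """Columns whose removal makes at least two training signals ambiguous."""
    n_cols = len(signals[0])
    core = []
    for removed in range(n_cols):
        remaining = [c for c in range(n_cols) if c != removed]
        if has_ambiguity(signals, classes, remaining):
            core.append(removed)
    return core


signals = [[1, 1, 2], [2, 1, 2], [1, 2, 1], [1, 2, 3]]  # hypothetical labeled rows
classes = [1, 2, 1, 3]
print(compute_core(signals, classes))  # [0, 2]
```

In this small table, removing column 0 makes the first two rows identical with different classes, and removing column 2 does the same for the last two rows, so both belong to the core; removing column 1 causes no ambiguity.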

[0064] Computer system 100 then uses a reduct calculator 145 for calculating the reducts. Starting with the core, columns are added one at a time and the signal comprised of only those columns is examined to determine how many ambiguous signals there are. The set of all columns with the minimum number of ambiguous signals is set aside. If there is only one column with the minimum number of ambiguities, that column is added to the core set of columns. The process repeats with the remaining columns. When the number of ambiguities becomes zero, that set of columns comprises a reduct. We now go back and replace the columns where there were several columns with the same number of minimum ambiguities and continue the process to determine the other reducts. Reduct determination is also done in known rough set theory.
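The following sketch shows a simplified version of the search described above. It grows a single reduct from the core by repeatedly adding the column that leaves the fewest ambiguous signals. As a stated simplification, ties are broken by taking the first such column, whereas the text revisits tied columns to enumerate further reducts. All names and the sample data are hypothetical, and column indices are 0-based.

```python
from collections import defaultdict


def count_ambiguous(signals, classes, columns):
    """Number of signals that share their values on `columns` with another class."""
    groups = defaultdict(list)
    for row, cls in zip(signals, classes):
        groups[tuple(row[c] for c in columns)].append(cls)
    return sum(len(g) for g in groups.values() if len(set(g)) > 1)


def greedy_reduct(signals, classes, core):
    """Grow one reduct from the core; ties go to the lowest column index."""
    n_cols = len(signals[0])
    chosen = list(core)
    while count_ambiguous(signals, classes, chosen) > 0:
        remaining = [c for c in range(n_cols) if c not in chosen]
        if not remaining:
            raise ValueError("training set is ambiguous even with all columns")
        best = min(remaining,
                   key=lambda c: count_ambiguous(signals, classes, chosen + [c]))
        chosen.append(best)
    return sorted(chosen)


signals = [[1, 1, 2], [2, 1, 2], [1, 2, 1], [1, 2, 3]]  # hypothetical labeled rows
classes = [1, 2, 1, 3]
print(greedy_reduct(signals, classes, core=[0, 2]))  # [0, 2]
```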

[0065] Once all the reducts are calculated for the partitioned and reduced training information data set 105, computer system 100 can be used to test the performance of the full set of training signals in the training information data set 105. The full training set including duplicate and ambiguous signals is tested to determine performance of each reduct. Obviously there will be some misclassifications due to ambiguous signals. The reduced set of training signals (without duplicates and ambiguities), the reducts, and the performance of each reduct on the full training set are saved. This comprises the classification system. When classifying, a reduct specifies which columns of a signal are to be used for classification. These columns are selected from the training set 105. If a match is found then the signal is assigned the class of the training signal it matches. If no match is found the signal is marked as unclassified. In the training and test sets 105, where the correct answer is known, the performance of a reduct (a value called the probability of correct classification, Pcc) can be calculated from a confusion matrix. A confusion matrix C is constructed as follows. For each known signal, if the signal is known to be of class a and the reduct classifies it as class a then C_(aa) is incremented by one. If the reduct classifies the signal as class b then C_(ab) is incremented by one. Pcc is then computed as:

$\mathrm{Pcc} = \frac{\sum_{i=1}^{\#\,\mathrm{classes}} C_{ii}}{\sum_{i=1}^{\#\,\mathrm{classes}} \sum_{j=1}^{\#\,\mathrm{classes}} C_{ij}}$
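The confusion matrix and Pcc calculation can be sketched as below. The text does not say how unclassified signals enter the matrix, so this sketch assumes every signal received a class label. Function names and sample data are hypothetical.

```python
def confusion_matrix(true_classes, predicted_classes, class_ids):
    """C[a][b] counts signals of known class a that the reduct classified as b."""
    index = {c: i for i, c in enumerate(class_ids)}
    n = len(class_ids)
    matrix = [[0] * n for _ in range(n)]
    for known, predicted in zip(true_classes, predicted_classes):
        matrix[index[known]][index[predicted]] += 1
    return matrix


def probability_correct(matrix):
    """Pcc = trace of C divided by the sum of all entries of C."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


known = [1, 1, 2, 2, 3]       # hypothetical correct classes
predicted = [1, 2, 2, 2, 3]   # hypothetical reduct output
C = confusion_matrix(known, predicted, class_ids=[1, 2, 3])
print(C)                       # [[1, 1, 0], [0, 2, 0], [0, 0, 1]]
print(probability_correct(C))  # 0.8
```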

[0066] The test signals may also be normalized, such as with normalization logic 110, and wavelet transformed, such as with a wavelet transformer 120, by computer system 100 as already described above for the training signals of the training information data set 105.

[0067] The labeling of the test signals 105 is slightly different than for the training set 105. The computer system 100 uses a labeler 125 as well as fuzz factor logic 126 to label the test signals of the test data set 105. Each column of the training set had a threshold value determined by information index. The training threshold is used for labeling the test set. However, if the signal value in the test set falls too close to the threshold value it is given a label associated with “don't care”. This means that this value will not be used in the classification processing of that signal. This “don't care” region is established by the user with a ‘fuzz factor’. If the minimum value of a column from the training set is called y_(min), the maximum value called y_(max), the threshold value called y_(t), and the fuzz factor called ƒ, then the don't care distance δ is defined as:

δ = ƒ * min(y_(t) − y_(min), y_(max) − y_(t))

[0068] and the “don't care” region which will receive the special label is defined as:

y_(t)±δ
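The fuzz factor labeling above can be sketched as follows. The function names, the use of `None` as the “don't care” label, the convention that values above the threshold are labeled 1, and the choice to treat the edges of the region as inside it are assumptions made for this example only.

```python
DONT_CARE = None  # assumed representation of the "don't care" label


def dont_care_distance(y_min, y_max, y_t, f):
    """delta = f * min(y_t - y_min, y_max - y_t), as defined in the text."""
    return f * min(y_t - y_min, y_max - y_t)


def label_test_value(value, y_min, y_max, y_t, f):
    """Binary-label one test value, or mark it "don't care" near the threshold."""
    delta = dont_care_distance(y_min, y_max, y_t, f)
    if abs(value - y_t) <= delta:  # inside y_t +/- delta (edges included by assumption)
        return DONT_CARE
    return 1 if value > y_t else 0  # assumed polarity: 1 above the threshold


# Training column spans 0..10 with threshold 4; a fuzz factor of 0.25 gives delta = 1.
print(dont_care_distance(0.0, 10.0, 4.0, 0.25))
print([label_test_value(v, 0.0, 10.0, 4.0, 0.25) for v in (2.0, 4.5, 5.5)])
```

A fuzz factor of zero reduces this to ordinary threshold labeling, apart from values exactly equal to the threshold.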

[0069] The last step in the classification, or object identification, performed by computer system 100 is to use a novel Reduct Classification Fuser 150 having logic to fuse or combine what may be some otherwise marginal reducts. By this system and method, the combination of the marginal reducts, as well as some strong ones, results in an improved result for the final classification of each signal or object in the test data set 105. Each test signal is evaluated using every reduct to determine classification. All of these classifications are combined, based on the reduct's performance on the training set, to yield the final classification. The formula for combining the various results is:

$W_t = 1 - \frac{Pcc_{\max} + \left[\sum_{i=1}^{n}\left(1 - Pcc_i\right)\right]\left(1 - Pcc_{\max}\right)}{\sum_{i=1}^{n}\frac{1}{1 - Pcc_i + \varepsilon}}$

[0070] Using this, each signal is given W_(t) scores associated with each possible class. The final classification is assigned to the class with the highest W_(t) value if that value exceeds a threshold value set by the user.
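One possible reading of the fusion step is sketched below. The function `fused_weight` evaluates the W_(t) expression exactly as printed for a list of Pcc values. How those values are grouped is an assumption: here the n reducts entering the sum for class t are taken to be the reducts that assigned the test signal to class t. The names `fused_weight` and `fuse`, the default value of ε, and the example numbers are likewise hypothetical.

```python
def fused_weight(pccs, eps=1e-6):
    """Evaluate the printed W_t expression for the Pcc values of n reducts."""
    pcc_max = max(pccs)
    numerator = pcc_max + sum(1 - p for p in pccs) * (1 - pcc_max)
    denominator = sum(1 / (1 - p + eps) for p in pccs)
    return 1 - numerator / denominator


def fuse(votes, threshold, eps=1e-6):
    """Combine per-reduct results into one class, or None if below the threshold.

    `votes` is a list of (predicted_class, pcc_of_reduct) pairs; a predicted
    class of None means that reduct left the signal unclassified.
    """
    by_class = {}
    for cls, p in votes:
        if cls is not None:
            by_class.setdefault(cls, []).append(p)  # assumed grouping by voted class
    if not by_class:
        return None
    scores = {cls: fused_weight(ps, eps) for cls, ps in by_class.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Two reducts vote for class "a", one weaker reduct votes for class "b".
votes = [("a", 0.9), ("a", 0.8), ("b", 0.6)]
print(fuse(votes, threshold=0.5, eps=0.01))
```

Under this reading, two agreeing reducts with Pcc of 0.9 and 0.8 score higher together than the 0.9 reduct would alone, which is consistent with the stated aim of letting marginal reducts contribute.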

[0071] Illustrated in FIG. 4 is an exemplary methodology for determining what classification a signal belongs to. Of course, as mentioned above, this could be the identification of an object, which may be represented as a signal. The illustrated method in FIG. 4 could be implemented on a computer system such as 100 in FIG. 1, with a user interface (10 in FIG. 1), and the appropriate software and logic for encoding and performing the steps as will be described below. The blocks shown represent functions, actions or events performed therein. It will be appreciated that computer software applications involve dynamic and flexible processes such that the illustrated blocks can be performed in sequences different from the one shown. It will also be appreciated by one of ordinary skill in the art that the software of the present invention may be implemented using various programming approaches such as procedural, object oriented or artificial intelligence techniques.

[0072] The methodology of FIG. 4 will be described with additional reference to FIG. 1. The method is applicable to high dimensional data sets using rough set theory or data mining, that heretofore were computationally too large to handle by any known methods.

[0073] The method steps represented in FIG. 4 have been described above, in the description of the system shown in FIG. 1. So as not to be redundant, we will summarize the method steps below.

[0074] Training signals 205 are provided in the form of a table or an array, similar to the training information data set 105 in FIG. 1. The table can have a plurality of columns and rows. Attributes, or features, of the training signals 205, or classification, are column entries for the table. Each row represents a signal and comprises all the attributes of the classification.

[0075] The training signals may then be normalized as is shown in block 210 labeled “normalize signals.” This method step may be done in accordance with the normalization logic 110 described above for the FIG. 1 system. Of course, as detailed above, different data signals may be normalized differently, as will be appreciated by one of ordinary skill in the art.

[0076] The method step for partitioning the signals of the training signals 205 into a plurality of partitions of columns may be performed in accordance with the process block 215, labeled “partition signals.” Of course, the number of partitions and the type of partitioning scheme, such as block or interleave, described above (and see FIGS. 2 and 3), are up to the user or method operator. Again, such inputs, if implemented on a computer system (see 100 in FIG. 1), may be obtained by a pre-processor program (not shown) via a series of queries to the user, as is known in the computing arts.
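The two partitioning schemes can be illustrated with a short sketch. It assumes, for the example only, that a block partition groups runs of adjacent column indices and that an interleave partition takes every n-th column; the function names and the handling of a column count that does not divide evenly are hypothetical.

```python
def block_partition(n_cols, n_parts):
    """Group adjacent columns into at most n_parts blocks (last block may be shorter)."""
    size = -(-n_cols // n_parts)  # ceiling division
    return [list(range(start, min(start + size, n_cols)))
            for start in range(0, n_cols, size)]


def interleave_partition(n_cols, n_parts):
    """Group non-adjacent columns: partition p takes columns p, p + n_parts, ..."""
    return [list(range(p, n_cols, n_parts)) for p in range(n_parts)]


print(block_partition(8, 2))
print(interleave_partition(8, 2))
```

Each returned list holds the column indices of one partition, so a signal's partition is obtained by selecting those columns from its row.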

[0077] The method may also include the step of wavelet transforming each of the signals in each of the partitions. This is illustrated in the FIG. 4 process flow diagram at block 220, labeled “wavelet transform signals.” Again, as described above, the wavelet chosen may be a Haar wavelet. The additional pseudo attributes of the wavelet transformed signals are appended to the ends of the signals as additional column entries.
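A minimal sketch of appending Haar wavelet coefficients to a signal follows. Several details are assumptions, not taken from the description: the averaging form of the Haar step (pairwise mean and half-difference), the decomposition down to a single approximation value, the order in which coefficients are appended, and a signal length that is a power of two. The function names are hypothetical.

```python
def haar_level(signal):
    """One Haar step: pairwise averages (approximation) and half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def append_haar_features(signal):
    """Return the signal followed by its Haar coefficients as extra columns.

    Assumes len(signal) is a power of two. Detail coefficients are appended
    from the finest level to the coarsest, then the final approximation.
    """
    features = list(signal)
    approx = list(signal)
    while len(approx) >= 2:
        approx, detail = haar_level(approx)
        features.extend(detail)
    features.extend(approx)
    return features


print(append_haar_features([4, 2, 6, 8]))
```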

[0078] The method step of binary labeling the signals is illustrated in block 225. Binary labeling may be accomplished with the multi-class entropy method as described above for labeler 125 (FIG. 1). The step of selecting a subset of the plurality of best-performing columns, as determined by those having the highest information index already discussed above for the column selector 130 in FIG. 1, is illustrated in FIG. 4 at block 230.

[0079] Next, the method includes the step of removing the duplicate and ambiguous signals from the training signals or training information set. This step is depicted in block 235 of FIG. 4 and was more thoroughly discussed above with reference to system component 135 in FIG. 1.
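The removal step can be sketched as follows, under two assumptions stated here because this passage does not define the terms: a duplicate is taken to be a signal with the same labeled values and the same class as an earlier signal, and an ambiguous signal is one whose labeled values also occur under a different class. In this sketch every copy of an ambiguous signal is dropped. The function name is hypothetical.

```python
from collections import defaultdict


def remove_duplicates_and_ambiguous(rows, classes):
    """Keep one copy of each unambiguous labeled signal."""
    classes_by_row = defaultdict(set)
    for row, cls in zip(rows, classes):
        classes_by_row[tuple(row)].add(cls)

    kept_rows, kept_classes, seen = [], [], set()
    for row, cls in zip(rows, classes):
        key = tuple(row)
        if len(classes_by_row[key]) > 1:
            continue  # ambiguous: same values appear under more than one class
        if key in seen:
            continue  # duplicate of a signal already kept
        seen.add(key)
        kept_rows.append(key)
        kept_classes.append(cls)
    return kept_rows, kept_classes


rows = [(0, 1), (0, 1), (1, 1), (1, 1), (1, 0)]
classes = ["a", "a", "a", "b", "b"]
print(remove_duplicates_and_ambiguous(rows, classes))
```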

[0080] The step of calculating the core is represented at 240. This was thoroughly described above with reference to core calculator 140 in FIG. 1, as well as the description and FIGS. 9 and 10.

[0081] Calculating each of the reducts is illustrated at 245 in FIG. 4 and was described above for the system of FIG. 1 with reference to reduct calculator 145. Again, this would be computationally burdensome if the training signals were not partitioned, wavelet transformed, labeled and downselected (selection of a subset of the best-performing columns). Via this method, much larger, high dimensional data set problems can be classified, or identified, or solved.

[0082] Once the reducts are calculated on the training signals of the partitions, as described hereinabove, the full training set of signals 205 is tested to determine the performance of each reduct. Stated another way, method step 205 involves performing the classification of each signal in the training signals 205, or training information set, by using each of the reducts. This, too, was described further above. The reduced set of training signals, the reducts, and the performance of each reduct on the full set of training signals 205 can be saved, such as by block 250 in FIG. 4 labeled “save reducts.” Block 275 indicates the end of this portion of the method dealing exclusively with data in the training signals 205 set.

[0083] Next, a set of test signals is provided as at step 305. They may similarly take the form of a table or array with a plurality of columns of attributes and rows of test signals. These test signals can be normalized and wavelet transformed, such as via method steps indicated in FIG. 4 at 310 and 320, respectively. This is further described above for the system of FIG. 1 and the use of normalization logic 110 and wavelet transformer 120 by the computer system 100 on both the training and testing information data sets (signals) 105.

[0084] The test signals 305 may be binary labeled, as at process block 325. The binary labeling of the test signals is slightly different than that for the training signals, and may use fuzz factor logic to identify a “don't care” region for some column values that lie close to the labeling point and thus may lead to erroneous labeling due to noise, round off error, or other factors. This was described more fully above with reference to FIG. 1 and the system references 100, 125 and 126.

[0085] Now, for each of the reducts, represented at 355, the inventive method includes the step of determining a separate reduct classification for each of the test signals using each of the reducts 355. This is represented at process step 360 in FIG. 4 and was further described above with reference to FIG. 1 and computer system 100 and the reduct classification fuser 150. The individual reduct classifications for each of the test signals are then combined, as at 365, to produce a better classification, or identification, result than the individual reduct classifications. By the novel method of combining, or fusing, the otherwise marginal reducts as well as some of the better-performing individual reduct classifiers, the accuracy of the classification method and system is greatly increased. Another way to say this is that this fusing step is for determining a final classification of each test signal by combining each of the separate reduct classifications for each of the test signals. This was described more fully above for the computer system 100 and reduct classification fuser 150 of FIG. 1.

[0086] The final classification results can be outputted, as at 370, and the method finished as at end 375.

[0087] With the present invention, an object or signal from a set of test data full of unknown signals can be identified, or classified, based on information extracted from a known set of training data. Although not unlike Rough Set Theory, or data mining techniques, in this regard, the present inventive systems and method have a number of highly desirable features that permit real time use on high dimensional data sets. Accuracy of the solution is substantially strengthened, not sacrificed. The invention has wide ranging applicability to all types of classification problems, not just the HRR example for aircraft ATR systems. The inventive systems and method will automatically solve the recognition, classification or identification problems. The concept of multi-class entropy labeling to find the best labeling point in the data has been shown. The concept of using two data partitioning schemes (block and interleave) to reduce the computational time and improve classification accuracy was also described and illustrated. This also makes the methodology and system less sensitive to noise in the data and less sensitive to registration of the data. The method used to compute the minimal reducts in a reasonable time (quadratic instead of exponential time complexity) allows larger (real world) problems to be solved. The method used to fuse the results vastly increases the accuracy of the system. The method of fuzzifying the test data when values are close to the dividing point makes the system less sensitive to noise.

[0088] While the present invention has been illustrated by the description of embodiments thereof, and while the embodiments have been described in considerable detail, it is not the intention of the applicants to restrict or in any way limit the scope of the appended claims to such detail. Additional advantages and modifications will readily appear to those skilled in the art. The invention, in its broader aspects, is not limited to the specific details, the representative apparatus, and illustrative examples shown and described. Accordingly, departures may be made from such details without departing from the spirit or scope of the applicant's general inventive concept.

We claim:
 1. A system for identifying an object wherein said object is represented by a signal, comprising: a computer system; a training information data set, said training set in the form of a table having a plurality of rows and a plurality of columns, wherein each row represents a signal and each column represents attributes associated with each given signal; a labeler; a reduct calculator; a testing information data set; and a reduct classification fuser.
 2. The system of claim 1 further comprising normalization logic for normalizing said signals in said training information data set.
 3. The system of claim 2 further comprising a partitioner.
 4. The system of claim 3 further comprising a wavelet transformer.
 5. The system of claim 1 further comprising a column selector.
 6. The system of claim 5 further comprising duplicate and ambiguous signal removal logic.
 7. The system of claim 6 further comprising a core calculator.
 8. The system of claim 1 further comprising a core calculator.
 9. The system of claim 1 wherein the labeler binary labels the training information data set and wherein the labeler further labels the test information data set.
 10. The system of claim 6, wherein the computer system tests the performance of the full training set and wherein the computer system automatically identifies the signals in the test set.
 11. The system of claim 9 wherein the labeler includes fuzz factor logic.
 12. A system for classifying a signal, comprising: a computer system; a training data set; a testing data set; and a reduct classification fuser, wherein each of the data sets are in the form of tables of data each having a plurality of rows and columns, each of the rows representing a signal.
 13. The system of claim 12 further comprising a core calculator and reduct calculator.
 14. The system of claim 13 further comprising a labeler.
 15. The system of claim 12 further comprising normalization logic and a partitioner.
 16. The system of claim 15 further comprising a wavelet transformer.
 17. The system of claim 12 further comprising a user interface and a network.
 18. The system of claim 14 wherein the labeler includes fuzz factor logic.
 19. The system of claim 14 further comprising a column selector and duplicate and ambiguous signal removal logic.
 20. A data-mining method of determining what classification a signal belongs to, the method comprising the steps of: providing a training information set in the form of a table of data, the table having a plurality of columns and rows, wherein each column of the table represents an attribute of the classification and wherein each row of the table is a signal and represents all the attributes associated with a specific classification; binary labeling the signals; selecting a subset of the plurality of columns, each column in the subset having a higher information index than any of the remaining columns in the plurality of columns that were not selected; calculating each of the reducts; providing a set of test signals in the form of a table with a plurality of columns of attributes and rows of test signals; determining a separate reduct classification of each of a plurality of test signals using each of the reducts; and determining a final classification of each test signal of the plurality of test signals by combining each of the separate reduct classifications for each of the test signals.
 21. The method of claim 20 further comprising the step of binary labeling each of the test signals.
 22. The method of claim 21 wherein the step of binary labeling each of the test signals comprises the substeps of using a training threshold value determined for each column in the training set and establishing a threshold value tolerance band about each threshold value such that a column value falling within a respective threshold value tolerance band will not be considered in classifying that respective signal.
 23. The method of claim 20 further comprising the step of normalizing the training signals.
 24. The method of claim 20 further comprising the step of partitioning the training information set into a plurality of partitions of columns.
 25. The method of claim 24 wherein the partitioning step is done via a block partitioning method whereby pluralities of adjacent columns are grouped into the partitions.
 26. The method of claim 24 wherein the partitioning step is done via an interleave partitioning method whereby pluralities of non-adjacent columns are grouped into the partitions.
 27. The method of claim 24 further comprising the step of: wavelet transforming each of the signals in each of the partitions.
 28. The method of claim 20 further comprising the step of removing duplicate and ambiguous signals from the training information set.
 29. The method of claim 20 further comprising the step of calculating the core.
 30. The method of claim 20 further comprising the step of test performing the classification of each signal in the training information set using each of the reducts.
 31. The method of claim 21 further comprising the step of normalizing each of the test signals prior to accomplishing the step of binary labeling each of the test signals.
 32. The method of claim 20 further comprising the step of wavelet transforming each of the test signals after they are normalized and before they are binary labeled.